Session 4: Statistical Language Modeling
Abstract
Corpus-based Natural Language Processing (NLP) is now a well-established paradigm in NLP. The availability of large corpora, often annotated in various ways, has led to the development of a variety of approaches to statistical language modeling. The papers in this session represent many of these important approaches. I will try to classify these papers along different dimensions, giving the reader an overview as well as some understanding of the future directions of work in this area.
Similar resources
Dynamic Web log session identification with statistical language models
on statistical language modeling. Unlike standard timeout methods, which use fixed time thresholds for session identification, we use an information theoretic approach that yields more robust results for identifying session boundaries. We evaluate our new approach by learning interesting association rules from the segmented session files. We then compare the performance of our approach to three...
Session 2: Language Modeling
This session presented four interesting papers on statistical language modeling aimed at improving large-vocabulary speech recognition. The basic problem in language modeling is to derive accurate underlying representations from a large amount of training data, a fundamental problem it shares with acoustic modeling. As demonstrated in this session, many techniques used for acoustic mode...
User Modeling of Parallel Workloads
The goal of workload modeling is to simulate the expected workload, accurately enough to enable making correct design and administrative decisions. Several statistical features of production parallel computer workloads, which are not embodied in current models, have been identified. Their practical importance is demonstrated by two new kinds of schedulers – a key component in determining the ov...
Session 11 - Natural Language III
The five papers in this session, as well as the ten papers in the other two natural language sessions, can be classified into three broad categories: (1) statistical approaches to natural language processing and the automatic acquisition of linguistic structure (2 out of 5 papers in this session; 8 out of 15 overall); (2) robust processing of texts by combining multiple partial analyses (2 out ...
Session 8: Statistical Language Modeling
Over the past several years, the successful application of statistical techniques in natural language processing has penetrated further and further into written language technology, proceeding with time from the periphery of written language processing into deeper and deeper aspects of language processing. At the periphery of natural language understanding, Hidden Markov Models were first applie...
Publication date: 1992